Stream-Order and Order-Statistics∗
Authors
Abstract
When processing a data-stream in small space, how important is the order in which the data arrives? Are there problems that are unsolvable when the ordering is worst-case but that can typically be solved in practice? Does the role of ordering become less significant when we permit algorithms to make multiple passes over the data-stream, provided the stream is not re-ordered between passes? If we consider the stream ordered by an adversary, what happens if we restrict the power of the adversary? We study these questions in the context of quantile estimation, one of the most studied problems in the data-stream model. Our specific results include an O(log n)-space, O(log log n)-pass algorithm for exact selection in a randomly ordered stream of n elements. This resolves an open question of Munro and Paterson [Theor. Comput. Sci., 23 (1980), pp. 315–323]. We then demonstrate an exponential separation between the random-order and adversarial-order models: using O(polylog n) space, exact selection requires Ω(log n / log log n) passes in the adversarial-order model. This is established via a new bound on the communication complexity of a natural pointer-chasing style problem and, in contrast to previous results, applies to fully general randomized algorithms. We also prove the first fully general lower bounds in the random-order model: finding an n^δ-approximate median in the single-pass random-order model with probability at least 9/10 requires Ω(√(n^(1−3δ) / log n)) space.
∗Part of this work originally appeared in PODS 2006 [9] and ICALP 2007 [10].
†Dept. of Computer and Information Science, University of Pennsylvania. Email: [email protected]. Supported in part by an Alfred P. Sloan Research Fellowship and by NSF Awards CCF-0430376 and CCF-0644119.
‡Information Theory and Applications Center, University of California, San Diego. Email: [email protected]. Part of this work was done while the author was at the University of Pennsylvania.
Similar resources
Delay Jitter Correlation Analysis for Traffic Transmission on High Speed Networks
We study the time-dynamic behavior of delay jitter as captured by the autocorrelation function; the second-order statistics provide information relating to consecutive cell loss in real-time services. We derive an expression for delay jitter correlation of a stationary traffic stream in an MMPP/M/1/K system with FIFO service discipline; use of a type of deterministic rate server yields the same ...
Detecting Low Complexity Clusters by Skewness and Kurtosis in Data Stream Clustering
Established statistical representations of data clusters employ up to second order statistics including mean, variance, and covariance. Strategies for merging clusters have been largely based on intra- and inter-cluster distance measures. The distance concept allows an intuitive interpretation, but it is not designed to merge from the viewpoint of probability distributions. We suggest an alternat...
ON AN INDEPENDENT RESULT USING ORDER STATISTICS AND THEIR CONCOMITANT
Let X1, X2, ..., Xn have a jointly multivariate exchangeable normal distribution. In this work we investigate another proof of the independence of X̄ and S² using order statistics. We also assume that (Xi, Yi), i = 1, 2, ..., n, are jointly distributed as bivariate normal and establish the independence of the mean and the variance of concomitants of order statistics.
Stream Order and Order Statistics: Quantile Estimation in Random-Order Streams
When trying to process a data stream in small space, how important is the order in which the data arrive? Are there problems that are unsolvable when the ordering is worst case, but that can be solved (with high probability) when the order is chosen uniformly at random? If we consider the stream as if ordered by an adversary, what happens if we restrict the power of the adversary? We study thes...
Higher Order Moments and Recurrence Relations of Order Statistics from the Exponentiated Gamma Distribution
Order statistics arising from the exponentiated gamma (EG) distribution are considered. Closed-form expressions for the single and double moments of order statistics are derived. Measures of skewness and kurtosis of the probability density function of the rth order statistic for different choices of r, n, and θ are presented. Recurrence relations between single and double moments of r...
Recurrence Relations for Single and Product Moments of Generalized Order Statistics from pth Order Exponential Distribution and its Characterization
In this paper, we establish some recurrence relations for single and product moments of generalized order statistics from the pth order exponential distribution. Further, the results are deduced for the recurrence relations of record values and ordinary order statistics, and, using a recurrence relation for single moments, we obtain a characterization of the pth order exponential distribution.